Pareto Front Learning (PFL) was recently introduced as an effective approach to obtain a mapping function from a given trade-off vector to a solution on the Pareto front, which solves the multi-objective optimization (MOO) problem. Due to the inherent trade-off between conflicting objectives, PFL offers a flexible approach in the many scenarios in which decision makers cannot specify a preference for one Pareto solution over another and must switch between them depending on the situation. However, existing PFL methods ignore the relationship between the solutions during the optimization process, which hinders the quality of the obtained front. To overcome this issue, we propose a novel PFL framework, namely \ourmodel, which employs a hypernetwork to generate multiple solutions from a set of diverse trade-off preferences and enhances the quality of the Pareto front by maximizing the Hypervolume indicator defined by these solutions. The experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines in producing the trade-off Pareto front.
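To make the Hypervolume indicator mentioned above concrete, here is a minimal sketch for the two-objective case. It assumes both objectives are minimized and that a reference point `ref` worse than every solution is supplied; the function name `hypervolume_2d` and the sample points are hypothetical and are not taken from the paper, whose framework maximizes this quantity through a hypernetwork rather than computing it in isolation.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Discard points that do not strictly improve on the reference point.
    candidates = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest second objective seen so far
    for f1, f2 in candidates:  # ascending in the first objective
        if f2 < best_f2:  # point is non-dominated among those seen so far
            # Add the horizontal strip that this point newly dominates.
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Hypothetical front of three mutually non-dominated solutions.
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(4.0, 4.0)))  # 6.0
```

A dominated solution such as `(3.0, 3.0)` adds nothing to the area, which is why maximizing the indicator pushes a set of solutions toward a front that is both close to optimal and well spread.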
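Identifying pills from images captured under various conditions and backgrounds has become increasingly important. Several efforts in the literature have been devoted to deep learning-based approaches to the pill recognition problem. However, because pills often look highly similar to one another, misrecognition occurs frequently, and pill recognition remains a challenge. To this end, in this paper we introduce a novel approach named PIKA that leverages external knowledge to enhance pill recognition accuracy. Specifically, we address a practical scenario (which we call contextual pill recognition) that aims to identify pills in a picture of a patient's pill intake. First, we propose a novel method for modeling the implicit association between pills in the presence of an external data source, in this case prescriptions. Second, we present a walk-based graph embedding model that transforms from the graph space to a vector space and extracts condensed relational features of the pills. Third, we provide a final framework that leverages both image-based visual features and graph-based relational features to accomplish the pill recognition task. Within this framework, the visual representation of each pill is mapped to the graph embedding space, which is then used to perform attention over the graph representation, resulting in a semantically rich context vector that aids the final classification. To our knowledge, this is the first study to use external prescription data to establish associations between medicines and to classify them using this auxiliary information. The architecture of PIKA is lightweight and flexible enough to be incorporated into any recognition backbone. The experimental results show that, by leveraging the external knowledge graph, PIKA can improve recognition accuracy by 4.8% to 34.1% compared with the baselines.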
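Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that, for many applications, these attention heads learn redundant embeddings, and most of them can be removed without degrading the performance of the model. Inspired by this observation, we propose Transformer with a Mixture of Gaussian Keys (Transformer-MGK), a novel transformer architecture that replaces redundant heads in transformers with a mixture of keys at each head. These mixtures of keys follow a Gaussian mixture model and allow each attention head to focus efficiently on different parts of the input sequence. Compared with its conventional transformer counterpart, Transformer-MGK accelerates training and inference, has fewer parameters, and requires fewer FLOPs to compute, while achieving comparable or better accuracy across tasks. Transformer-MGK can also be easily extended to linear attention. We empirically demonstrate the advantage of Transformer-MGK in a range of practical applications, including language modeling and tasks that involve very long sequences. On the Wikitext-103 and Long Range Arena benchmarks, Transformer-MGKs with 4 heads attain performance comparable to or better than the baseline transformers with 8 heads.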
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies the DR grade, localizes lesion areas, and provides visual explanations; (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraints between lesion features and classification features, our approach can be robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
Air pollution is an emerging problem that needs to be addressed, particularly in developed and developing countries. In Vietnam, air pollution is a concern in big cities such as Hanoi and Ho Chi Minh City, where it comes mostly from vehicles such as cars and motorbikes. To tackle the problem, this paper focuses on developing a solution that can estimate emitted PM2.5 pollutants by counting the number of vehicles in traffic. We first surveyed recent object detection models and developed our own traffic surveillance system. The observed traffic density showed a similar trend to the measured PM2.5 with a certain time lag, suggesting a relation between traffic density and PM2.5. We further express this relationship with a mathematical model that can estimate the PM2.5 value based on the observed traffic density. The estimated result correlated strongly with the measured PM2.5 plots in the urban area context.
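One simple way to probe a time-lagged relation like the one described above is to compute the Pearson correlation between the two series at several candidate lags and keep the lag with the highest value. The sketch below assumes two equally sampled series; the names `pearson`, `lagged_correlation`, and `best_lag`, and the numbers used, are hypothetical illustrations and are not the paper's model or data.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def lagged_correlation(traffic, pm25, lag):
    """Correlate traffic[t] with pm25[t + lag] for a non-negative lag."""
    return pearson(traffic[: len(traffic) - lag], pm25[lag:])


def best_lag(traffic, pm25, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(range(max_lag + 1), key=lambda k: lagged_correlation(traffic, pm25, k))


# Made-up series: the second one repeats the first with a delay of two steps.
traffic = [3, 8, 1, 9, 4, 7, 2, 6, 5, 10, 0, 11]
pm25 = [0, 0] + traffic[:-2]
print(best_lag(traffic, pm25, max_lag=4))  # 2
```

With real measurements the peak would be less sharp, and the lag found this way is only a descriptive statistic, not evidence of causation.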
The instrumental variable (IV) approach is widely used to estimate the causal effect of a treatment on an outcome of interest from observational data with latent confounders. A standard IV is expected to be related to the treatment variable and independent of all other variables in the system. However, it is challenging to search for a standard IV directly from data because of these strict conditions. The conditional IV (CIV) method has been proposed to allow a variable to be an instrument conditional on a set of variables, which widens the choice of possible IVs and enables broader practical applications of the IV approach. Nevertheless, no data-driven method exists to discover a CIV and its conditioning set directly from data. To fill this gap, in this paper we propose to learn the representations of the information of a CIV and its conditioning set from data with latent confounders for average causal effect estimation. By taking advantage of deep generative models, we develop a novel data-driven approach for simultaneously learning the representation of a CIV from measured variables and generating the representation of its conditioning set given measured variables. Extensive experiments on synthetic and real-world datasets show that our method outperforms the existing IV methods.
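For readers unfamiliar with the standard IV setting that the abstract starts from, the sketch below simulates a latent confounder and compares a naive regression slope with the classical ratio-of-covariances IV estimator. This is the textbook single-instrument estimator, not the representation-learning CIV method the paper proposes; the data-generating process, coefficients, and function names are assumptions chosen for illustration.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_estimate(x, y):
    """Naive regression slope of y on x; biased when a confounder is unobserved."""
    return covariance(x, y) / covariance(x, x)


def iv_estimate(z, x, y):
    """Classical IV estimator: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=20000, true_effect=2.0, seed=0):
    """Generate (z, x, y) where a latent u drives both treatment x and outcome y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)             # latent confounder, never observed
        zi = rng.gauss(0, 1)            # instrument, independent of u
        xi = zi + u + rng.gauss(0, 1)   # treatment depends on z and u
        yi = true_effect * xi + 3.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
print("naive slope:", ols_estimate(x, y))   # pulled away from 2.0 by the confounder
print("IV estimate:", iv_estimate(z, x, y))  # close to the true effect of 2.0
```

The instrument works here only because `z` is generated independently of `u` and affects `y` solely through `x`, which is exactly the strict condition the abstract says is hard to verify from data.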
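Successful artificial intelligence systems often require large amounts of labeled data to extract information from document images. In this paper, we investigate the problem of improving the performance of artificial intelligence systems in understanding document images, especially when training data is limited. We address the problem by proposing a novel fine-tuning method based on reinforcement learning. Our approach treats the information extraction model as a policy network and uses policy gradient training to update the model so as to maximize combined reward functions that complement the traditional cross-entropy losses. Our experiments on four datasets, using labels and expert feedback, show that our fine-tuning mechanism consistently improves the performance of a state-of-the-art information extractor, especially in the small training data regime.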
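Data-free Knowledge Distillation (DFKD) has recently attracted attention thanks to its appealing ability to transfer knowledge from a teacher network to a student network without using training data. The main idea is to use a generator to synthesize data for training the student. As the generator is updated, the distribution of the synthetic data changes. This distribution shift can be large if the generator and the student are trained adversarially, causing the student to forget the knowledge it acquired at previous steps. To alleviate this problem, we propose a simple yet effective method called Momentum Adversarial Distillation (MAD), which maintains an exponential moving average (EMA) copy of the generator and uses synthetic samples from both the generator and the EMA generator to train the student. Because the EMA generator can be regarded as an ensemble of the generator's old versions and usually changes less between updates than the generator does, training on its synthetic samples can help the student recall past knowledge and prevent the student from adapting too quickly to new updates of the generator. Our experiments on six benchmark datasets, including ImageNet and Places365, show that MAD outperforms competing methods in handling the large distribution shift problem. Our method also compares favorably with existing DFKD methods and even achieves state-of-the-art results in some cases.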
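The EMA copy described above can be illustrated with a few lines of arithmetic. The sketch below treats the generator's parameters as a flat list of floats and applies the usual exponential moving average rule; the momentum value, the fake "training" step, and the name `ema_update` are assumptions for illustration, not details from the paper.

```python
def ema_update(ema_params, params, momentum=0.9):
    """Move each EMA parameter a small step toward the current parameter."""
    return [momentum * e + (1.0 - momentum) * p for e, p in zip(ema_params, params)]


# Hypothetical generator parameters; the EMA copy starts as an exact clone.
generator = [0.0, 0.0]
ema_generator = list(generator)

for step in range(3):
    # Stand-in for one adversarial update that shifts the generator noticeably.
    generator = [w + 1.0 for w in generator]
    # The EMA copy follows, but only by a fraction (1 - momentum) of the gap.
    ema_generator = ema_update(ema_generator, generator)

print(generator)      # parameters after three large updates
print(ema_generator)  # lags well behind, blending older generator versions
```

In a setting like the one the abstract describes, samples drawn from both sets of parameters would be fed to the student; the slower-moving copy is what keeps earlier synthetic distributions represented.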
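Multiresolution deep learning approaches, such as the U-Net architecture, have achieved high performance in classifying and segmenting images. However, these approaches do not provide a latent image representation and cannot be used to decompose, denoise, and reconstruct image data. U-Net and other convolutional neural network (CNN) architectures commonly use pooling to enlarge the receptive field, which usually results in irreversible information loss. This study proposes to include a Riesz-Quincunx (RQ) wavelet transform, which combines 1) a higher-order Riesz wavelet transform and 2) orthogonal Quincunx wavelets (both of which have been used to reduce blur in medical images), inside the U-Net architecture to reduce noise in satellite images and their time series. In the transformed feature space, we propose a variational approach to understand how random perturbations of the features affect the image, in order to further reduce noise. Combining both approaches, we introduce a hybrid RQUNet-VAE scheme for image and time-series decomposition, used to reduce noise in satellite imagery. We present qualitative and quantitative experimental results showing that our proposed RQUNet-VAE is more effective at reducing noise in satellite imagery than other state-of-the-art methods. We also apply our scheme to several applications for multi-band satellite images, including image denoising, image and time-series decomposition by diffusion, and image segmentation.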
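In many fields of scientific research and real-world applications, unbiased estimation of causal effects from non-experimental data is crucial for understanding the mechanism underlying the data and for deciding on effective responses or interventions. A great deal of research has been conducted on this challenging problem from different angles. For causal effect estimation from data, assumptions such as the Markov property, faithfulness, and causal sufficiency are always made. Even under these assumptions, full knowledge, such as a set of covariates or an underlying causal graph, is still required. A practical challenge is that in many applications no such full knowledge, or only partial knowledge, is available. In recent years, research has emerged that uses search strategies based on graphical causal modelling to discover useful knowledge from data for causal effect estimation under some mild assumptions, and it has shown promise in tackling this practical challenge. In this survey, we review these methods and focus on the challenges that data-driven methods face. We discuss the assumptions, strengths, and limitations of the data-driven methods. We hope this review will motivate more researchers to design better data-driven methods based on graphical causal modelling for the challenging problem of causal effect estimation.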